2024 Voter Turnout: Hotspots

spatial data
data cleaning
visualisation
In this post, we walk through a hotspot analysis of the difference in voter turnout between the 2019 and 2024 National and Provincial elections in South Africa.
Author

Sivuyile Nzimeni

Published

17 June 2024

see code
lapply(c("tidyverse","janitor","tidymodels","sf",
         "spdep","ggthemes","tmap","showtext"),
       require,
       character.only = TRUE) |> 
  suppressWarnings() |> 
  suppressMessages() |> 
  invisible()

theme_set(theme_minimal())
sf_use_s2(FALSE)

1 INTRODUCTION

While most media coverage has focused on the coalition negotiations,¹ there is an equally interesting development in the 2024 South African National and Provincial Election results, namely voter turnout. In successive election cycles since 1999, the country has recorded diminishing voter turnout. Turnout peaked in 1999 at 89.3%, and the latest cycle registered the lowest turnout yet at 58.6% (O’Regan 2024). There are bound to be spatial variations in voting patterns and turnout as well.

In this post, we analyse voter turnout differences at ward level. This presents several challenges. Firstly, voting districts and wards are mutable. In other words, they are subject to change from one election cycle to the next. As a consequence, we can either account for changes from previous ward demarcations to their current version or impute some other value to measure the differences between the 2019 and 2024 National and Provincial Elections.

Secondly, we do not include Out-of-Country votes in the analysis, as those votes are not linked to a ward. Out-of-Country votes include neither Provincial nor Regional ballots. They are also affected by the decision of The Electoral Court of South Africa (2024) on the role of honorary consulates, high commissions and consulates as voting stations. Effectively, the decision introduces a large set of new voting stations, remarkably different from previous election cycles.

Voter Turnout

Voter turnout is defined as a percentage of the registered population: \(\text{voter turnout} = \frac{100}{\text{registered population}} \times \text{total votes}\). It is applied to both election years, and the turnout difference is effectively \(\text{voter turnout}_{2024} - \text{voter turnout}_{2019}\).
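As a minimal sketch of this definition, the snippet below computes turnout for both years and their difference. The data frame `ward_results`, its column names and its values are hypothetical and only illustrate the arithmetic. The helper `turnout()` is also an assumption, written as `100 * votes / registered`, which is algebraically the same as the formula above.

```r
library(dplyr)

# Hypothetical ward-level totals (illustrative values only)
ward_results <- data.frame(
  ward_id         = c("W1", "W2", "W3"),
  registered_2019 = c(1000, 2000, 1500),
  votes_2019      = c(700, 1200, 900),
  registered_2024 = c(1100, 2100, 1400),
  votes_2024      = c(660, 1050, 910)
)

# Turnout as a percentage of the registered population
turnout <- function(votes, registered) 100 * votes / registered

ward_results <- ward_results |>
  mutate(
    turnout_2019 = turnout(votes_2019, registered_2019),
    turnout_2024 = turnout(votes_2024, registered_2024),
    # Negative values mean turnout fell between 2019 and 2024
    turnout_diff = turnout_2024 - turnout_2019
  )
```

A negative `turnout_diff` indicates a ward where proportionally fewer registered voters showed up in 2024 than in 2019.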

1.1 COLLECTING DATA

To collect the required data, we rely on two main data sources. The first is the Municipal Demarcation Board, the body responsible for drawing districts throughout the country. In turn, the Independent Electoral Commission of South Africa (IEC) determines the appropriate voting districts (voting station boundaries). The IEC is unambiguous about the independence of the voting districts from the work of the Municipal Demarcation Board. Voting districts are logistically sound regions aimed at minimising voter inconvenience and limiting voter fraud (‘About Voting Districts and Stations - Electoral Commission of South Africa’, n.d.).

Unlike previous years, sourcing voting districts and voting station coordinates proved markedly more difficult in 2024. Fortunately, SANEF’s election dashboard (‘Elections Dashboard » SANEF Elections Portal 2024’, 2024) has a handy data export feature. The voting station locations can be joined to their respective wards, and the voting station results are then aggregated to ward level for the 2019 and 2024 elections.
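The join and aggregation step might look roughly like the sketch below. It is not the original preprocessing code: the toy `wards` polygons, the `stations` points and all column names are hypothetical. It simply shows a point-in-polygon join with `sf::st_join()` followed by a ward-level summary.

```r
library(sf)
library(dplyr)

# Hypothetical wards: two unit squares side by side
wards <- st_sf(
  ward_id = c("W1", "W2"),
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
    st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0))))
  )
)

# Hypothetical voting stations with coordinates and results
stations <- st_as_sf(
  data.frame(
    station    = c("A", "B", "C"),
    x          = c(0.2, 0.7, 1.5),
    y          = c(0.5, 0.5, 0.5),
    registered = c(1000, 500, 800),
    votes      = c(600, 250, 400)
  ),
  coords = c("x", "y")
)

# Attach the ward each station falls within, then aggregate to ward level
ward_totals <- stations |>
  st_join(wards, join = st_within) |>
  st_drop_geometry() |>
  group_by(ward_id) |>
  summarise(registered = sum(registered),
            votes      = sum(votes),
            .groups    = "drop") |>
  mutate(turnout = 100 * votes / registered)
```

With real data, both layers would need to share the same coordinate reference system before the join.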

There are a few sanity checks in the data preprocessing, such as excluding newly demarcated wards, since they don’t have a 2019 baseline. Voting stations with turnout greater than 100% are removed. This pattern is particularly glaring at voting stations that were in temporary structures such as tents. We do not include data from Provincial and Regional ballots, since we aren’t necessarily interested in voting patterns per se but rather in whether voters showed up.
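These checks could be expressed along the following lines. This is a sketch under stated assumptions rather than the actual preprocessing: `stations_2024`, `wards_2019` and their columns are hypothetical. An `inner_join()` stands in for "exclude wards without a 2019 baseline".

```r
library(dplyr)

# Hypothetical 2024 station results; S2 has more votes than registered voters
stations_2024 <- data.frame(
  station    = c("S1", "S2", "S3", "S4"),
  ward_id    = c("W1", "W1", "W2", "W9"),
  registered = c(500, 40, 800, 300),
  votes      = c(300, 55, 400, 150)
)

# Hypothetical 2019 ward-level baseline; W9 is newly demarcated and absent
wards_2019 <- data.frame(
  ward_id      = c("W1", "W2"),
  turnout_2019 = c(65, 55)
)

clean_2024 <- stations_2024 |>
  mutate(station_turnout = 100 * votes / registered) |>
  filter(station_turnout <= 100) |>            # remove stations above 100%
  group_by(ward_id) |>
  summarise(registered = sum(registered),
            votes      = sum(votes),
            .groups    = "drop") |>
  mutate(turnout_2024 = 100 * votes / registered) |>
  inner_join(wards_2019, by = "ward_id") |>    # keep wards with a 2019 baseline
  mutate(turnout_diff = turnout_2024 - turnout_2019)
```

In this toy example, station S2 and ward W9 are dropped, leaving two wards with a turnout difference each.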

The final dataset is an sf object with the aggregate turnout results across wards in both the 2019 and the 2024 National and Provincial Elections. Our variable of interest is the change in turnout from 2019 to 2024.

2 EXPLORATORY ANALYSIS

see code
TurnOut|>
  ggplot()+
  geom_sf(aes(fill = turnout_diff),
          color = "black")+
  scale_fill_viridis_c(breaks = c(-40,-20,0,20,40))+
  labs(subtitle = "2019 - 2024 Turnout Difference (%)",
       fill = "Turnout Difference(%)")+
  theme_void()+
  theme(
    text = element_text(family = "IBM Plex Sans"),
    plot.title = element_text(face = "bold",
                              hjust = 0.5),
    plot.subtitle = element_text(face = "italic",hjust =0.5),
    legend.position = "bottom"
  )
Figure 1: 2024 National and Provincial Election Results

Figure 1 illustrates the turnout differences across wards. There are at least 4454 wards and, as a result, insights are lost in the noise. For example, the metropolitan areas are indistinguishable from the rest of the country, because their differences are hidden by ward boundaries. It is possible to ‘zoom’ into these areas of interest.
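One way to ‘zoom’ is to subset the wards before plotting. The sketch below assumes a `municipality` column, which is hypothetical and may not exist in the actual dataset, and uses a toy grid in place of the real ward polygons.

```r
library(sf)
library(dplyr)

# Toy 4 x 4 grid standing in for ward polygons
grid <- st_make_grid(
  st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 4, ymax = 4))),
  n = c(4, 4)
)

toy_wards <- st_sf(
  ward_id      = seq_along(grid),
  # Hypothetical municipality label for each ward
  municipality = rep(c("Metro A", "Other"), times = c(4, 12)),
  geometry     = grid
)

# Keep only the wards of one metropolitan municipality
metro_wards <- toy_wards |>
  filter(municipality == "Metro A")
```

The subset can then be passed to the same `ggplot()` and `geom_sf()` calls used for the national map, and the plot extent shrinks to the selected wards.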

Figure 2: 2019 - 2024 Metropolitan Municipality Turnout Differences (%)
(a) Cape Town
(b) Johannesburg
(c) Gqeberha
(d) Tshwane
(e) Durban

Figure 2 illustrates differences in turnout across five metropolitan municipalities. This approach provides a more granular view of outcomes while focusing on regions with higher population densities. Some distinct patterns emerge at ward level and metropolitan level. One approach to quantifying these patterns is hotspot analysis. Effectively, we can rely on a number of statistics to assess spatial autocorrelation. Kopczewska (2021, pp. 149-211) provides a succinct summary of spatial autocorrelation, global and local statistics, and their visualisation.

2.1 CREATING HOTSPOTS

In the code below, we complete a couple of tasks. First, we create a neighbours list from the ward polygons using poly2nb. Next, the neighbours list is converted to spatial weights (nb2listw), which can also be used to compute spatially lagged values (lag.listw).
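A minimal sketch of these steps on a toy grid is shown here. The queen contiguity rule, the row-standardised weights (`style = "W"`) and `zero.policy = TRUE` are assumptions, not necessarily the settings used to build `Spatial_Listw` in this post.

```r
library(sf)
library(spdep)

# Toy 5 x 5 grid standing in for ward polygons
grid <- st_make_grid(
  st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 5, ymax = 5))),
  n = c(5, 5)
)
toy_wards <- st_sf(turnout_diff = seq_len(25) - 13, geometry = grid)

# 1. Neighbours list: polygons sharing a border or a corner (queen contiguity)
toy_nb <- poly2nb(toy_wards, queen = TRUE)

# 2. Spatial weights: row-standardised; tolerate wards without neighbours
toy_listw <- nb2listw(toy_nb, style = "W", zero.policy = TRUE)

# 3. Spatial lag: weighted average of neighbouring turnout differences
toy_lag <- lag.listw(toy_listw, toy_wards$turnout_diff, zero.policy = TRUE)
```

Wards without neighbours, for example islands, are the reason to consider `zero.policy = TRUE` on real data.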

The spatial weights are used as input in the estimation of a local spatial statistic (Getis-Ord G), which helps us identify clusters of high or low turnout difference. The hotspot function classifies whether the observed patterns are of interest. Finally, we visualise the results.
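For reference, the local Getis-Ord statistic for ward \(i\) is commonly written as below. Here \(x_j\) denotes the turnout difference in ward \(j\) and \(w_{ij}\) the spatial weight between wards \(i\) and \(j\); this notation is ours and does not come from the code.

\[
G_i = \frac{\sum_{j \neq i} w_{ij}\, x_j}{\sum_{j \neq i} x_j},
\qquad
Z(G_i) = \frac{G_i - \mathrm{E}(G_i)}{\sqrt{\mathrm{Var}(G_i)}}
\]

The standardised value \(Z(G_i)\) is what gets classified: large positive values point to clusters of high turnout difference and large negative values to clusters of low turnout difference.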

see code
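# Local Getis-Ord G with conditional permutation inference
G_Local <- localG_perm(TurnOut$turnout_diff,
                       Spatial_Listw)

# Classify wards as "High" or "Low" clusters at the 0.05 cutoff,
# without adjusting the p-values for multiple testing
G_Local_Classy <- hotspot(G_Local,
                          Prname = "Pr(z != E(Gi))",
                          cutoff = 0.05,
                          p.adjust = "none")

# Count the wards in each class
G_Local_Classy |> table()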
see code
TurnOut$hotspot_classification <- G_Local_Classy

TurnOut |> 
  ggplot()+
  geom_sf(aes(fill =hotspot_classification),
          color = "black")+
  scale_fill_manual(
    values = c("High" = "#0f204b",
               "Low" = "#A71930")
  )+
  labs(fill = "Hotspot Classification")+
  theme_void()+
    theme(
      text = element_text(family = "IBM Plex Sans"),
      plot.title = element_text(face = "bold",hjust = 0.5),
      plot.subtitle = element_text(face = "italic",hjust = 0.5),
      legend.position = "bottom"
    )
Figure 3: 2019 - 2024 Voter Turnout Hotspots
G_Local_Classy
 Low High 
 401  314 

Figure 3 illustrates a map of the hotspots throughout South Africa. However, it has the same flaw observed in Figure 1: the hotspots are sparsely distributed throughout the country. As such, it can be difficult to extract meaningful information from the visualisation. In addition, the map as-is does not contain any additional information such as cities, the built environment, roads, etc.

Accordingly, we enhance the visualisation using the rdeck package (North 2024), which offers an interface to Mapbox’s visualisation capabilities. The code below is adapted from Walker (2024).

MAPBOX ACCESS TOKEN

The Mapbox service requires an account and an access token. It offers a generous free tier.
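One way to supply the token is through an environment variable. The sketch below assumes that rdeck picks up the token from `MAPBOX_ACCESS_TOKEN`, as its README describes. The token value is a placeholder, not a real token.

```r
# Placeholder token: replace with the token from your own Mapbox account
Sys.setenv(MAPBOX_ACCESS_TOKEN = "pk.your-token-here")

# Confirm that the variable is visible to the current R session
token_is_set <- nzchar(Sys.getenv("MAPBOX_ACCESS_TOKEN"))
```

For anything beyond a quick test, keeping the token in a user-level `.Renviron` file avoids committing it alongside the analysis code.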

2.2 INTERACTIVE VISUALISATION

see code
library(rdeck)
library(mapdeck)
library(viridisLite)

TurnOut_Subset <- TurnOut |> 
  filter(!is.na(hotspot_classification))

rdeck(map_style = mapbox_satellite_streets(),
      initial_view_state = view_state(
        center = c(24.0850297,-29.6978701),
        zoom = 5))|>
  add_polygon_layer(
    data = TurnOut_Subset,
    pickable = TRUE,
    visible = TRUE,
    get_polygon = geometry,
    opacity = 0.6,
    get_fill_color = rdeck::scale_color_category(
      col = hotspot_classification,
      palette = cividis(n = table(TurnOut$hotspot_classification)|>length(),
      direction = -1)
      )
    )
Figure 4: Interactive Hotspots

3 CONCLUSION

Our primary aim was to assess turnout differences between the 2019 and 2024 National and Provincial Elections in South Africa. This involved some data preprocessing, merging and aggregation of turnout for each election cycle. After some initial visualisations, we relied on the local Getis-Ord G statistic to find hotspot clusters. The final visualisation is interactive and includes a satellite image of South Africa for added context.

This walk-through is fairly superficial: we do not include covariates to explain differences in turnout, nor do we consider events that may have occurred in those regions. Rule (2018), Fransman and von Fintel (2024) and others have considered a broader spectrum of variables that could explain voting patterns.

References

‘About Voting Districts and Stations - Electoral Commission of South Africa’. n.d. https://www.elections.org.za/pw/Voters-Roll/About-voting-districts-and-stations.
‘Elections Dashboard » SANEF Elections Portal 2024’. 2024. https://elections.sanef.org.za/dashboard/.
Fransman, Tina, and Marisa von Fintel. 2024. ‘Voting and Protest Tendencies Associated with Changes in Service Delivery’. Development Southern Africa 41 (1): 71–90. https://doi.org/10.1080/0376835X.2023.2252456.
Kopczewska, Katarzyna, ed. 2021. Applied Spatial Statistics and Econometrics: Data Analysis in R. Routledge Advanced Texts in Economics and Finance. Milton Park, Abingdon, Oxon ; New York, NY: Routledge Taylor & Francis Group.
North, Anthony. 2024. Rdeck: Deck.gl Widget. R Package Version. https://github.com/qfes/rdeck.
O’Regan, Victoria. 2024. ‘Over 11 Million Registered Voters Did Not Cast Ballots in SA Polls’. https://www.dailymaverick.co.za/article/2024-06-07-the-big-no-vote-over-11-million-registered-voters-did-not-cast-ballots-in-2024-polls/.
Rule, S. P. 2018. ‘Geography and Voting: The Growth of Urban Opposition in South Africa Two Decades After Democratisation’. South African Geographical Journal 100 (2): 141–161. https://doi.org/10.1080/03736245.2017.1339628.
The Electoral Court of South Africa. 2024. ‘Democratic Alliance and Another v Electoral Commission of South Africa and Others’, April. https://www.saflii.org/za/cases/ZAEC/2024/6.pdf.
Walker, Kyle. 2024. ‘WALKER DATA Getting and Visualizing Overture Maps Buildings Data in R’. https://walker-data.com/posts/overture-buildings/.

Footnotes

  1. See Bloomberg, DailyMaverick and others.